Jolt: Converting bytecode to C

The basic idea
public class xxy
{ public void doit(int cnt)
  { for (int i = 0; i < cnt; i++)
      for (int j = 0; j < cnt; j++)
        for (int k = 0; k < cnt; k++); } }
(after being compiled to bytecode and then translated to C) looks like
DEFUN(doit)
{
cp_item_type *_cp;
struct fieldblock *_fb;
struct methodblock *_mb;
int32_t var_0 = _P_[0].i;
int32_t var_1 = _P_[1].i;
int32_t var_2;
int32_t var_3;
int32_t var_4;
int32_t stack_1;
int32_t stack_2;
stack_1 = 0;
var_2 = stack_1;
goto jolt_label_36;
jolt_label_5:
stack_1 = 0;
var_3 = stack_1;
goto jolt_label_28;
jolt_label_10:
stack_1 = 0;
var_4 = stack_1;
goto jolt_label_19;
jolt_label_16:
var_4 += 1;
jolt_label_19:
stack_1 = var_4;
stack_2 = var_1;
if (stack_1 < stack_2)
{goto jolt_label_16;}
var_3 += 1;
jolt_label_28:
stack_1 = var_3;
stack_2 = var_1;
if (stack_1 < stack_2)
{goto jolt_label_10;}
var_2 += 1;
jolt_label_36:
stack_1 = var_2;
stack_2 = var_1;
if (stack_1 < stack_2)
{goto jolt_label_5;}
return _P_;
method_exit: return _P_;
}
This translates into very efficient native code when compiled with
optimizations: the compiled (and dynamically linked) code runs in
approximately 200ms, compared to 9000ms when the bytecode is
interpreted, for the same arguments to the method. A similarly coded
C loop runs in about the same time as the compiled code.

OK, big deal. How are real programs dealt with? Method calls are
implemented through the hooks available to native methods. For
instance, the canonical hello world program
public class xxy
{ public void doit(int cnt)
{ System.out.println("hello world"); } }
gets bytecoded into a field access followed by a method invocation
0 getstatic #7 <Field java.lang.System.out Ljava/io/PrintStream;>
3 ldc #1 <String "hello world">
5 invokevirtual #8 <Method java.io.PrintStream.println(Ljava/lang/String;)V>
8 return
which is translated into
if (java_lang_System_out_field == -1)
{ if (JoltStaticField(JoltSelfRef,_cp, 7, _EE_) == FALSE) goto method_exit;
java_lang_System_out_field = 1; }
_fb = _cp[7].p;
stack_1 = (int32_t) *(OBJECT *)normal_static_address(_fb);
if (JoltConst_1 == -1)
{ if (JoltResolveConst(_cp, 1, _EE_) == FALSE) goto method_exit;
JoltConst_1 = 1; }
stack_2 = (int32_t) _cp[1].p;
if (java_io_PrintStream_println8_offset == -1)
{ if (JoltVirtualResolve(JoltSelfRef,_cp, 8, _EE_) == FALSE) goto method_exit;
_mb = _cp[8].p;
java_io_PrintStream_println8_offset = _mb->fb.u.offset; }
{ Java8 _tmp;
_mb =
mt_slot(obj_methodtable((Handle *) stack_1),
java_io_PrintStream_println8_offset);
do_execute_java_method(_EE_,(void *)stack_1,NULL,NULL,_mb,FALSE,stack_2);}
if (exceptionOccurred(_EE_)) goto method_exit;
return _P_;
The first pass through the code stores offsets/pointers in static
variables in the C code, so that subsequent passes use the cached
information. Needless to say, method invocations (partly because
do_execute_java_method() packs and unpacks vararg lists) slow down
execution enough to make a long sequence of method invocations little
different from an interpreted approach.

As another benchmark, at the other end of the spectrum, the compiled
version of the translator is about 5% faster than the interpreted
version when re-translating itself. This isn't surprising: running
the original bytecode under the profiler reveals that approximately
89% of the time is spent writing out to files (essentially
System.out.println() calls), which of course is not compiled to C by
the translator. The second reason seems to be that a significant
portion of the remaining time is spent in recursive method calls that
determine the stack depth, which are not very well optimized by the
translation to C.
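
The resolve-once caching used by the generated code (a static variable
that holds -1 until the first pass fills it in) can be shown in
isolation. The following is a minimal sketch, not Jolt's actual output:
slow_resolve_offset() and get_println_offset() are hypothetical
stand-ins for the VM's resolution hooks, and the offset value is made
up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the VM's constant pool resolution.
   It counts its calls so the effect of caching can be observed.
   This is not a Jolt or JDK function. */
static int resolve_calls = 0;

static int32_t slow_resolve_offset(const char *name)
{
    resolve_calls++;
    /* Pretend this lookup is expensive; the "offset" is invented
       (here simply the length of the method name). */
    return (int32_t) strlen(name);
}

/* Cached offset for one call site; -1 means "not resolved yet". */
static int32_t println_offset = -1;

int32_t get_println_offset(void)
{
    if (println_offset == -1) {
        /* First pass only: resolve and remember the result. */
        println_offset = slow_resolve_offset("println");
        assert(println_offset != -1);
    }
    /* Later passes skip the resolution and reuse the cached value. */
    return println_offset;
}
```

Calling get_println_offset() twice invokes the stand-in resolver only
once; every later pass pays just for the comparison against -1.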
Finally, feel free to experiment with benchmarks of your own; just keep in mind the caveats imposed by the method, and do not draw too many conclusions either way. I'd love to see how they fare. You are welcome to contact me if you can hand out the bytecode and are unwilling to mess around with this system.
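
As a baseline for such experiments, the triple loop from the first
example can be written directly in C. This is a sketch under
assumptions: the name doit_native and the volatile counter are not
from Jolt. The counter is added so that an optimizing compiler does
not remove the otherwise empty loops, and so that the result can be
checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written equivalent of xxy.doit(cnt).
   The volatile counter is an addition for illustration: it keeps the
   loops from being optimized away and gives the caller a value to
   verify (cnt * cnt * cnt iterations). */
int64_t doit_native(int32_t cnt)
{
    volatile int64_t iterations = 0;

    assert(cnt >= 0);
    for (int32_t i = 0; i < cnt; i++)
        for (int32_t j = 0; j < cnt; j++)
            for (int32_t k = 0; k < cnt; k++)
                iterations++;
    return iterations;
}
```

Timing a call, for instance with clock() from <time.h>, gives a figure
to set against the translated and interpreted versions. The result
will depend heavily on the compiler and the optimization flags used.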
KB Sriram
Comments, bug reports: kbs@sbktech.org
Revised: Sat May 25 10:18:34 1996
URL: http://www.sbktech.org/genidea.html